Domain-Specific Word Sense Disambiguation combining corpus based and wordnet based parameters
Abstract
We present an algorithm for domain-specific all-words WSD. The scoring function used to rank the senses is inspired by the quadratic energy expression of the Hopfield network, a well-studied expression in neural networks. The scoring function is employed by a greedy iterative disambiguation algorithm that uses only the words disambiguated so far to disambiguate the current word in focus. The combination of the algorithm and the scoring function seems to perform well in two ways: (i) the algorithm beats the domain corpus baseline, which is typically hard to beat, and (ii) the algorithm strikes a good balance between efficiency and performance. The latter is established by comparing the iterative algorithm with a PageRank-like disambiguation algorithm and an exhaustive sense graph search algorithm. The accuracy values of approximately 69% (F1-score) in two different domains, where the domain corpus baseline stands at 65%, compare very well with the state of the art.
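The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: a greedy pass that fixes one word at a time and scores each candidate sense with a Hopfield-style quadratic expression (a unary term plus pairwise terms against the senses already chosen). The names `bias`, `activation`, `weights`, the toy sense labels, the numbers, and the "least ambiguous word first" ordering are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List, Tuple

Sense = str


def pair_weight(weights: Dict[Tuple[Sense, Sense], float], a: Sense, b: Sense) -> float:
    """Symmetric lookup of the (hypothetical) interaction weight between two senses."""
    return weights.get((a, b), weights.get((b, a), 0.0))


def greedy_disambiguate(
    candidates: Dict[str, List[Sense]],
    bias: Dict[Sense, float],
    activation: Dict[Sense, float],
    weights: Dict[Tuple[Sense, Sense], float],
) -> Dict[str, Sense]:
    """Choose one sense per word, using only senses fixed in earlier steps."""
    chosen: Dict[str, Sense] = {}
    # Assumption: handle the least ambiguous words first so they act as anchors.
    for word in sorted(candidates, key=lambda w: len(candidates[w])):

        def score(sense: Sense) -> float:
            # Unary term: bias_i * V_i
            unary = bias.get(sense, 0.0) * activation.get(sense, 0.0)
            # Pairwise term: sum over already-chosen senses j of W_ij * V_i * V_j
            pairwise = sum(
                pair_weight(weights, sense, other)
                * activation.get(sense, 0.0)
                * activation.get(other, 0.0)
                for other in chosen.values()
            )
            return unary + pairwise

        chosen[word] = max(candidates[word], key=score)
    return chosen


# Toy data (entirely made up) for a finance-domain sentence.
candidates = {
    "bank": ["bank.finance", "bank.river"],
    "deposit": ["deposit.money"],
    "interest": ["interest.finance", "interest.curiosity"],
}
bias = {"bank.finance": 1.0, "interest.finance": 1.0}
activation = {
    "deposit.money": 1.0,
    "bank.finance": 0.5,
    "bank.river": 0.5,
    "interest.finance": 0.4,
    "interest.curiosity": 0.6,
}
weights = {
    ("bank.finance", "deposit.money"): 0.9,
    ("bank.river", "deposit.money"): 0.1,
    ("interest.finance", "deposit.money"): 0.8,
    ("interest.finance", "bank.finance"): 0.9,
}

result = greedy_disambiguate(candidates, bias, activation, weights)
print(result)
```

Because each word is scored only against senses that are already fixed, the cost grows with the number of words times the number of candidate senses, rather than with the product of all sense counts that an exhaustive search over sense combinations would face. This is the kind of efficiency trade-off the abstract alludes to.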
Similar resources
Domain-Specific Semantic Class Disambiguation Using WordNet
This paper presents an approach which exploits general-purpose algorithms and resources for domain-specific semantic class disambiguation, thus facilitating the generalization of semantic patterns from word-based to class-based representations. Through the mapping of the domain-specific semantic hierarchy onto WordNet and the application of general-purpose word sense disambiguation and semanti...
Google & WordNet based Word Sense Disambiguation
This paper presents an unsupervised methodology for automatic disambiguation of noun terms found in domain specific unrestricted corpora. This method extends approaches of Fragos (Fragos et al., 2003) and others that use the WordNet (Miller, 1998) database in order to resolve semantic ambiguity. The method is evaluated by disambiguating the noun collection of SemCor 2.0. Parameter adjustment wa...
Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
In this paper we present the mapping between WordNet domains and WordNet topics, and the emergent Wikipedia categories. This mapping leads to a coarse alignment between WordNet and Wikipedia, useful for producing domain-specific and multilingual corpora. Multilinguality is achieved through the cross-language links between Wikipedia categories. Research in word-sense disambiguation has shown tha...
Word Sense vs. Word Domain Disambiguation: A Maximum Entropy Approach
In this paper, a supervised learning system of word sense disambiguation is presented. It is based on conditional maximum entropy models. This system acquires the linguistic knowledge from an annotated corpus and this knowledge is represented in the form of features. The system was evaluated using both WordNet's senses and domains as the sets of classes of each word. Domain labels are obtained...
Knowledge-Based WSD and Specific Domains: Performing Better than Generic Supervised WSD
This paper explores the application of knowledge-based Word Sense Disambiguation systems to specific domains, based on our state-of-the-art graph-based WSD system that uses the information in WordNet. Evaluation was performed over a publicly available domain-specific dataset of 41 words related to Sports and Finance, comprising examples drawn from three corpora: one balanced corpus (BNC), and two...
Publication date: 2009